-
Notifications
You must be signed in to change notification settings - Fork 4k
ARROW-9790: [Rust][Parquet]: Increase test coverage in arrow_reader.rs #8009
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the basic idea -- drive the test with a table that is shared between bool, utf, and fixed_len_binary_array types
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you felt the need to add a comment explaining the values, a tiny suggestion is to declare all members of TestOptions as pub and use
TestOptions {
num_row_groups: 3,
num_rows: 25,
record_batch_size: 5,
num_iterations: 50,
}
instead, to increase readability of the values' meaning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good idea -- I will do so before turning this into a full on PR for review
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in f149500
3967510 to
f149500
Compare
f149500 to
bc384b3
Compare
| } | ||
|
|
||
| #[test] | ||
| fn test_bool_single_column_reader_test_batch_size_divides_into_row_group_size() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The coverage of these tests was moved into two table entries here: https://github.com/apache/arrow/pull/8009/files#diff-4c103051156d7b901fad8d9e26104932R393
jorgecarleitao
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I added two test cases in #8007, which increased coverage. However, upon further review, I noticed the choice of parameters to hit edge conditions didn't cover the string data types.
Rather than adding a bunch more copies of basically the same test to add new parameters for different tests, I instead propose using the same set of parameters for all data types and drive the tests using a table in this PR.
It makes the test logic slightly harder to follow, in my opinion, but it does increase coverage